Performance Benefits of DataMPI: A Case Study with BigDataBench

Authors

  • Fan Liang
  • Chen Feng
  • Xiaoyi Lu
  • Zhiwei Xu

Abstract

Apache Hadoop and Spark are gaining prominence in Big Data processing and analytics, and both are widely deployed at Internet companies. At the same time, the demand for high-performance data analysis is leading academic and industrial communities to adopt state-of-the-art HPC technologies to solve Big Data problems. We recently proposed DataMPI, a communication library based on key-value pairs that extends MPI to support Hadoop/Spark-like Big Data computing jobs. In this paper, we use BigDataBench, a Big Data benchmark suite, to comprehensively characterize the performance and resource utilization of Hadoop, Spark, and DataMPI. In our experiments, DataMPI achieves speedups in job execution time of up to 55% over Hadoop and up to 39% over Spark. Most of the benefit comes from the high-efficiency communication mechanisms in DataMPI. We also observe that DataMPI uses resources (CPU, memory, disk, and network I/O) more efficiently than the other two frameworks.

Publication date: 2014